logarithmic number


Greedy Optimization Provably Wins the Lottery: Logarithmic Number of Winning Tickets is Enough

Neural Information Processing Systems

Despite the great success of deep learning, recent works show that large deep neural networks are often highly redundant and can be significantly reduced in size. However, the theoretical question of how much we can prune a neural network given a specified tolerance of accuracy drop remains open. This paper provides one answer to this question by proposing a greedy-optimization-based pruning method. The proposed method guarantees that the discrepancy between the pruned network and the original network decays at an exponentially fast rate with respect to the size of the pruned network, under weak assumptions that hold in most practical settings. Empirically, our method improves on prior art when pruning various network architectures, including ResNet and MobileNetV2/V3, on ImageNet.
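The greedy idea can be illustrated with a toy sketch (not the paper's exact algorithm, which operates on neural network neurons and their weights): start from an empty subnetwork and repeatedly add the unit whose inclusion, with refit output weights, most reduces the squared discrepancy to the original output. All names below (`greedy_select`, `features`, `target`) are illustrative.

```python
import numpy as np

def greedy_select(features, target, k):
    # features: (n_samples, n_units) outputs of the original layer's units.
    # target:   (n_samples,) output of the full model to match.
    # Greedily add the unit whose inclusion (with refit least-squares
    # weights) most reduces the squared discrepancy to the target.
    chosen = []
    best_err = np.inf
    for _ in range(k):
        best, best_err = None, np.inf
        for j in range(features.shape[1]):
            if j in chosen:
                continue
            cols = features[:, chosen + [j]]
            w, *_ = np.linalg.lstsq(cols, target, rcond=None)
            err = np.sum((cols @ w - target) ** 2)
            if err < best_err:
                best, best_err = j, err
        chosen.append(best)
    return chosen, best_err

# Toy check: the target is an exact combination of units 0 and 3,
# so two greedy steps should drive the discrepancy to (near) zero.
rng = np.random.default_rng(1)
F = rng.normal(size=(100, 5))
t = 2 * F[:, 0] - F[:, 3]
chosen, err = greedy_select(F, t, 2)
```

The exponential decay result in the abstract is about how fast such a discrepancy shrinks as the number of selected units grows; this sketch only shows the selection loop itself.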


Learning Some Popular Gaussian Graphical Models without Condition Number Bounds

Neural Information Processing Systems

Gaussian Graphical Models (GGMs) have wide-ranging applications in machine learning and the natural and social sciences. In most of the settings in which they are applied, the number of observed samples is much smaller than the dimension and they are assumed to be sparse. While there are a variety of algorithms (e.g. Graphical Lasso, CLIME) that provably recover the graph structure with a logarithmic number of samples, to do so they require various assumptions on the well-conditioning of the precision matrix that are not information-theoretically necessary. Here we give the first fixed polynomial-time algorithms for learning attractive GGMs and walk-summable GGMs with a logarithmic number of samples without any such assumptions. In particular, our algorithms can tolerate strong dependencies among the variables. Our result for structure recovery in walk-summable GGMs is derived from a more general result for efficient sparse linear regression in walk-summable models without any norm dependencies. We complement our results with experiments showing that many existing algorithms fail even in some simple settings where there are long dependency chains.
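For intuition, here is the naive baseline that the sample-complexity discussion is implicitly measured against: with far more than a logarithmic number of samples, simply inverting the empirical covariance and thresholding recovers a chain-structured GGM. The paper's contribution concerns the much harder logarithmic-sample regime, where this naive estimator (and many sophisticated ones) break down. The construction below is purely illustrative, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 20000  # dimension and an *abundant* sample size

# Sparse chain-structured precision matrix: Theta[i, j] != 0 iff |i - j| <= 1.
# The strong off-diagonal entries create the long dependency chains the
# abstract mentions.
Theta = np.eye(d) * 2.0
for i in range(d - 1):
    Theta[i, i + 1] = Theta[i + 1, i] = -0.9

Sigma = np.linalg.inv(Theta)
X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)

# Naive estimator: invert the empirical covariance, threshold small entries.
Theta_hat = np.linalg.inv(np.cov(X, rowvar=False))
edges_hat = np.abs(Theta_hat) > 0.3
edges_true = np.abs(Theta) > 0
```

With n much larger than d this recovers the chain exactly; the interesting regime in the paper is n logarithmic in the dimension, where inverting the empirical covariance is not even well defined.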


Reviews: Computationally and statistically efficient learning of causal Bayes nets using path queries

Neural Information Processing Systems

This paper gives algorithms for recovering the structure of causal Bayesian networks. The main focus is on using path queries, that is, asking whether a directed path exists between two nodes. Unlike with descendant queries, with path queries one can only hope to recover the transitive structure (an equivalence class of graphs). The main contribution here is to show that at least this can be done in polynomial time, while each query relies on interventions that require only a logarithmic number of samples. The authors do this for discrete and sub-Gaussian random variables, show how the result can be patched up to recover the actual graph, and suggest specializations (rooted trees) and extensions (imperfect interventions).
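The transitive-structure point can be made concrete with a small sketch: querying a path oracle on every ordered pair recovers the transitive closure, and two DAGs that differ only in transitively redundant edges yield the same closure, so the closure identifies the graph only up to that equivalence class. The oracle and graphs below are toy stand-ins for the paper's statistical path queries.

```python
def has_path(u, v, edges):
    # True iff a directed path from u to v exists in the (acyclic) edge set.
    if (u, v) in edges:
        return True
    return any((u, w) in edges and has_path(w, v, edges) for w in (0, 1, 2))

def recover_transitive_closure(nodes, path):
    # path(u, v) answers: is there a directed path from u to v?
    # O(n^2) oracle queries recover the transitive closure, which pins
    # down the DAG only up to its transitive equivalence class.
    return {(u, v) for u in nodes for v in nodes if u != v and path(u, v)}

# Two toy DAGs differing only in the transitively redundant edge 0 -> 2:
dag_a = {(0, 1), (1, 2), (0, 2)}
dag_b = {(0, 1), (1, 2)}
closure_a = recover_transitive_closure([0, 1, 2], lambda u, v: has_path(u, v, dag_a))
closure_b = recover_transitive_closure([0, 1, 2], lambda u, v: has_path(u, v, dag_b))
```

Both closures come out identical, which is exactly why path queries alone cannot distinguish `dag_a` from `dag_b` and extra work is needed to patch up the actual graph.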


Procrastination Is All You Need: Exponent Indexed Accumulators for Floating Point, Posits and Logarithmic Numbers

Liguori, Vincenzo

arXiv.org Artificial Intelligence

The method comprises two phases: an accumulation phase where the mantissas of the floating point numbers are added to accumulators indexed by the exponents and a reconstruction phase where the actual summation result is finalised. Various architectural details are given for both FPGAs and ASICs including fusing the operation with a multiplier, creating efficient MACs. Some results are presented for FPGAs, including a tensor core capable of multiplying and accumulating two 4x4 matrices of bfloat16 values every clock cycle using ~6,400 LUTs + 64 DSP48 in AMD FPGAs at 700+ MHz. The method is then extended to posits and logarithmic numbers.
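A minimal software sketch of the two phases, assuming IEEE-754 doubles rather than the paper's bfloat16 hardware datapath: each addend's mantissa is accumulated, as a fixed-point integer, into an accumulator selected by its exponent, and the result is reconstructed afterwards. The name `exponent_indexed_sum` and the choice `mant_bits=53` (matching double precision) are assumptions of this sketch.

```python
import math

def exponent_indexed_sum(values, mant_bits=53):
    # Phase 1 (accumulation): split each addend into mantissa and exponent
    # and add the mantissa, as a fixed-point integer, into the accumulator
    # indexed by the exponent. For doubles this phase is exact, since a
    # 53-bit mantissa scaled by 2**53 is an integer.
    acc = {}
    for x in values:
        m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= |m| < 1 (or m == 0)
        acc[e] = acc.get(e, 0) + int(m * (1 << mant_bits))
    # Phase 2 (reconstruction): fold the per-exponent accumulators back
    # into a single floating point result (this step may round).
    return sum(a * 2.0 ** (e - mant_bits) for e, a in acc.items())
```

The hardware version fuses phase 1 with a multiplier to form a MAC and keeps one accumulator per exponent value in registers; the dictionary here plays that role in software.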


Learning low-degree functions from a logarithmic number of random queries

Eskenazis, Alexandros, Ivanisvili, Paata

arXiv.org Machine Learning

We prove that for any integer n ∈ N, any d ∈ {1, ..., n} and any ε, δ ∈ (0, 1), a bounded function f: {-1, 1}^n → R of degree at most d can be learned, up to a given error in a prescribed metric, from a logarithmic (in n) number of random queries. We say that f has degree at most d if f̂(S) = 0 for every subset S ⊆ {1, ..., n} with |S| > d.
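The degree-d notion used above is the standard Fourier-Walsh one; spelled out in the usual notation (this display is a reconstruction of standard definitions, not text from the paper):

```latex
% Fourier--Walsh expansion on the Boolean cube: f has degree at most d
% when all Fourier coefficients on sets of size larger than d vanish.
f(x) = \sum_{\substack{S \subseteq \{1,\dots,n\} \\ |S| \le d}} \hat{f}(S)\, \chi_S(x),
\qquad \chi_S(x) = \prod_{i \in S} x_i,
\qquad \hat{f}(S) = \mathbb{E}_{x \in \{-1,1\}^n}\!\big[f(x)\, \chi_S(x)\big].
```

Learning f then amounts to estimating the at most $\binom{n}{\le d}$ nonzero coefficients $\hat{f}(S)$ from query access to f.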


A Richer Theory of Convex Constrained Optimization with Reduced Projections and Improved Rates

Yang, Tianbao, Lin, Qihang, Zhang, Lijun

arXiv.org Machine Learning

This paper focuses on convex constrained optimization problems, where the solution is subject to a convex inequality constraint. In particular, we aim at challenging problems for which both projection onto the constrained domain and linear optimization under the inequality constraint are time-consuming, which renders both projected gradient methods and conditional gradient methods (a.k.a. the Frank-Wolfe algorithm) expensive. In this paper, we develop projection-reduced optimization algorithms for both smooth and non-smooth optimization with improved convergence rates under a certain regularity condition of the constraint function. We first present a general theory of optimization with only one projection. Its application to smooth optimization with only one projection yields $O(1/\epsilon)$ iteration complexity, which improves over the $O(1/\epsilon^2)$ iteration complexity established before for non-smooth optimization and can be further reduced under strong convexity. Then we introduce a local error bound condition and develop faster algorithms for non-strongly convex optimization at the price of a logarithmic number of projections. In particular, we achieve an iteration complexity of $\widetilde O(1/\epsilon^{2(1-\theta)})$ for non-smooth optimization and $\widetilde O(1/\epsilon^{1-\theta})$ for smooth optimization, where $\theta\in(0,1]$, appearing in the local error bound condition, characterizes the local growth rate of the objective around the optimal solutions. Novel applications in solving the constrained $\ell_1$ minimization problem and a positive semi-definite constrained distance metric learning problem demonstrate that the proposed algorithms achieve significant speed-up compared with previous algorithms.
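One plausible instantiation of the "only one projection" idea, as an illustrative sketch rather than the paper's exact scheme: run subgradient steps on the penalized objective $f(x) + \lambda \max(0, g(x))$, which needs no projections, and project only the final averaged iterate onto the feasible set. All names and parameter values below are hypothetical.

```python
import numpy as np

def one_projection_subgradient(grad_f, g, grad_g, project, x0,
                               lam=10.0, steps=2000, lr=0.01):
    # Subgradient descent on F(x) = f(x) + lam * max(0, g(x)), followed by
    # a single projection of the averaged iterate onto {x : g(x) <= 0}.
    x = x0.astype(float)
    avg = np.zeros_like(x)
    for t in range(steps):
        sg = grad_f(x)
        if g(x) > 0:                         # penalty active only if infeasible
            sg = sg + lam * grad_g(x)
        x = x - (lr / np.sqrt(t + 1)) * sg   # decaying step size
        avg += x
    return project(avg / steps)              # the one and only projection

# Toy problem: min ||x - c||^2 s.t. ||x||_2 <= 1 with c outside the ball;
# the optimum is c / ||c||.
c = np.array([3.0, 4.0])
x_star = one_projection_subgradient(
    grad_f=lambda x: 2 * (x - c),
    g=lambda x: np.linalg.norm(x) - 1.0,
    grad_g=lambda x: x / np.linalg.norm(x),
    project=lambda x: x / max(1.0, np.linalg.norm(x)),
    x0=np.zeros(2),
)
```

The point of such schemes is that the (possibly expensive) projection is invoked once, or a logarithmic number of times in the faster variants, instead of once per iteration as in projected gradient descent.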